{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zbmL45h7fWRT"
      },
      "source": [
        "# Create Document Summarization Agents with Mistral OCR & CAMEL-AI 🐫\n",
        "\n",
        "You can also check this cookbook in Colab [here](https://colab.research.google.com/drive/1ZwVmqa5vjpZ0C3H7k1XIseFfbCR4mq17?usp=sharing)\n",
        "\n",
        "In this cookbook, we’ll explore [**Mistral OCR**](https://mistral.ai/news/mistral-ocr)—a state-of-the-art Optical Character Recognition API that understands complex document layouts and extracts text, tables, images, and equations with unprecedented accuracy. We’ll show you how to:\n",
        "\n",
        "- Use the Mistral OCR API to convert scanned or image-based PDFs into structured Markdown\n",
        "- Leverage a Mistral LLM agent within CAMEL to summarize and analyze the extracted content\n",
        "- Build a seamless, end-to-end pipeline for retrieval-augmented generation (RAG), research, or business automation\n",
        "\n",
        "## Table of Contents\n",
        "\n",
        "1. 🧑🏻‍💻 Introduction\n",
        "2. ⚡️ Step-by-step Guide: Mistral OCR Loader\n",
        "3. 💫 Quick Demo with CAMEL Agent\n",
        "4. 🧑🏻‍💻 Conclusion\n",
        "\n",
        "<div class=\"align-center\">\n",
        "  <a href=\"https://www.camel-ai.org/\"><img src=\"https://i.postimg.cc/KzQ5rfBC/button.png\" width=\"150\"></a>\n",
        "  <a href=\"https://discord.camel-ai.org\"><img src=\"https://i.postimg.cc/L4wPdG9N/join-2.png\" width=\"150\"></a>\n",
        "  \n",
        "⭐ <i>Star us on [*GitHub*](https://github.com/camel-ai/camel), join our [*Discord*](https://discord.camel-ai.org) or follow our [*X*](https://x.com/camelaiorg)</i>\n",
        "</div>\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Bl90urdzuQLu"
      },
      "source": [
        "## **Introduction to Mistral OCR**\n",
        "\n",
        "Throughout history, advancements in information abstraction and retrieval have driven human progress—from hieroglyphs to digitization. Today, over 90% of organizational data lives in documents, often locked in complex layouts and multiple languages. **Mistral OCR** ushers in the next leap in document understanding: a multimodal API that comprehends every element—text, images, tables, equations—and outputs ordered, structured Markdown with embedded media references.\n",
        "\n",
        "#### **Key Features of Mistral OCR:**\n",
        "\n",
        "1. **State-of-the-art complex document understanding**  \n",
        "   - Extracts interleaved text, figures, tables, and mathematical expressions with high fidelity.\n",
        "\n",
        "2. **Natively multilingual & multimodal**  \n",
        "   - Parses scripts and fonts from across the globe, handling right-to-left layouts and non-Latin characters seamlessly.\n",
        "\n",
        "3. **Doc-as-prompt, structured output**  \n",
        "   - Returns ordered Markdown, embedding images and bounding-box metadata ready for RAG and downstream AI workflows.\n",
        "\n",
        "4. **Top-tier benchmarks & speed**  \n",
        "   - In Mistral's published benchmarks, outperforms leading OCR systems in accuracy, especially on math, tables, and multilingual tests, while delivering fast batch inference (around 2,000 pages per minute).\n",
        "\n",
        "5. **Scalable & flexible deployment**  \n",
        "   - Available via `mistral-ocr-latest` on Mistral’s developer suite, cloud partners, and on-premises self-hosting for sensitive data.\n",
        "\n",
        "Ready to unlock your documents? Let’s dive into the extraction guide.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7p-JjpyNVcCT"
      },
      "source": [
        "First, install the CAMEL package with all its dependencies."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": true,
        "id": "I2X5A0LBc92C"
      },
      "outputs": [],
      "source": [
        "!pip install \"camel-ai[all]==0.2.61\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WCfZ6vA0iFQv"
      },
      "source": [
        "## ⚡️ Step-by-step Guide: Mistral OCR Loader"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FmuVvhpDyCT8"
      },
      "source": [
        "**Step 1: Set up your Mistral API key**\n",
        "\n",
        "If you don’t have a Mistral API key, you can obtain one by following these steps:\n",
        "\n",
        "1. **Create an account:**  \n",
        "   Go to [Mistral Console](https://console.mistral.ai/home) and sign up for an organization account.\n",
        "\n",
        "2. **Get your API key:**  \n",
        "   Once logged in, navigate to **Organization** → **API Keys**, generate a new key, copy it, and store it securely.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cuZwSWBYyCT8"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "from getpass import getpass\n",
        "\n",
        "mistral_api_key = getpass('Enter your Mistral API key: ')\n",
        "os.environ['MISTRAL_API_KEY'] = mistral_api_key\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LUpFBEB9vrIz"
      },
      "source": [
        "**Step 2: Upload your PDF or image file for OCR**\n",
        "\n",
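        "The cell below uses Colab's upload widget to upload a PDF or image file. Outside Colab, where `google.colab` is not available, skip it and set `file_path` to a local file path yourself.\n"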
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dUVE7z9hwEV7"
      },
      "outputs": [],
      "source": [
        "# Colab file upload\n",
        "from google.colab import files\n",
        "\n",
        "uploaded = files.upload()\n",
        "# Grab the first uploaded filename\n",
        "file_path = next(iter(uploaded))"
      ]
    },
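    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Outside Colab there is no upload widget, so `file_path` has to come from somewhere else. The sketch below is one minimal way to do that, assuming the file already sits on the local disk; `resolve_local_file` is a hypothetical helper written for this cookbook, not part of CAMEL or Mistral.\n",
        "\n",
        "```python\n",
        "from pathlib import Path\n",
        "\n",
        "\n",
        "def resolve_local_file(path_str):\n",
        "    # Hypothetical helper: expand '~', make the path absolute,\n",
        "    # and fail early if it does not point to a regular file.\n",
        "    path = Path(path_str).expanduser().resolve()\n",
        "    if not path.is_file():\n",
        "        raise FileNotFoundError(f'No such file: {path}')\n",
        "    return str(path)\n",
        "```\n",
        "\n",
        "You could then write `file_path = resolve_local_file('~/Documents/report.pdf')`, where the path is only a placeholder for your own file.\n"
      ]
    },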
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fwlB09epyeWK"
      },
      "source": [
        "**Step 3: Import and initialize the Mistral OCR loader**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "SgIsQpOX82Bo"
      },
      "outputs": [],
      "source": [
        "# Importing the MistralReader class from the camel.loaders module\n",
        "# This class handles document processing using Mistral OCR capabilities\n",
        "from camel.loaders import MistralReader\n",
        "\n",
        "# Initialize a MistralReader instance; it picks up the\n",
        "# MISTRAL_API_KEY environment variable set in Step 1\n",
        "mistral_reader = MistralReader()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ecWmxX4ZyoS8"
      },
      "source": [
        "**Step 4: Obtain OCR output from Mistral**\n",
        "\n",
        "Call `extract_text` with the path of the uploaded file. It accepts a local file path or a URL and returns the OCR result directly.\n",
        "\n",
        "The output of **Mistral OCR** is a structured object:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "rTKuFUXuyoS8"
      },
      "outputs": [],
      "source": [
        "# Run OCR on the uploaded file;\n",
        "# extract_text accepts a local file path or a URL\n",
        "ocr_response = mistral_reader.extract_text(file_path)\n",
        "print(ocr_response)"
      ]
    },
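    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "When the OCR result is passed to an LLM, it often helps to send only the recognized Markdown rather than the printed form of the whole object. The sketch below assumes the response exposes a `pages` list whose items carry a `markdown` string, as in the Mistral Python client's OCR response; check the object printed above before relying on it. `pages_to_markdown` and the stand-in object are hypothetical names used only for illustration.\n",
        "\n",
        "```python\n",
        "from types import SimpleNamespace\n",
        "\n",
        "\n",
        "def pages_to_markdown(ocr_response, separator='\\n\\n'):\n",
        "    # Assumption: ocr_response.pages is a list of objects with a\n",
        "    # .markdown string. Pages with empty Markdown are skipped.\n",
        "    parts = [page.markdown for page in ocr_response.pages if page.markdown]\n",
        "    return separator.join(parts)\n",
        "\n",
        "\n",
        "# Stand-in object with the assumed shape, so the helper runs offline.\n",
        "fake_response = SimpleNamespace(\n",
        "    pages=[\n",
        "        SimpleNamespace(index=0, markdown='# Title'),\n",
        "        SimpleNamespace(index=1, markdown=''),\n",
        "        SimpleNamespace(index=2, markdown='Body text'),\n",
        "    ]\n",
        ")\n",
        "document_markdown = pages_to_markdown(fake_response)\n",
        "print(document_markdown)\n",
        "```\n",
        "\n",
        "If the assumption holds for your response, `pages_to_markdown(ocr_response)` gives a single Markdown string for the whole document.\n"
      ]
    },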
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "65Z2nDTfUM0P"
      },
      "source": [
        "## 💫 Quick Demo with CAMEL Agent\n",
        "\n",
        "Here we choose a Mistral model for our demo. If you'd like to explore different models or tools to suit your needs, feel free to visit the [CAMEL documentation page](https://docs.camel-ai.org/), where you'll find guides and tutorials.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AtaiHz437A2_"
      },
      "source": [
        "The agent uses the `MISTRAL_API_KEY` you set in Step 1. If you don't have a Mistral API key yet, you can obtain one by following these steps:\n",
        "\n",
        "1. Visit the [Mistral Console](https://console.mistral.ai/)\n",
        "\n",
        "2. In the left panel, click **API keys** under the **API** section\n",
        "\n",
        "3. Choose your plan\n",
        "\n",
        "For more details, you can also check the Mistral documentation: https://docs.mistral.ai/getting-started/quickstart/"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1mKu0Dgj5TV9"
      },
      "outputs": [],
      "source": [
        "from camel.configs import MistralConfig\n",
        "from camel.models import ModelFactory\n",
        "from camel.types import ModelPlatformType, ModelType\n",
        "\n",
        "mistral_model = ModelFactory.create(\n",
        "    model_platform=ModelPlatformType.MISTRAL,\n",
        "    model_type=ModelType.MISTRAL_LARGE,\n",
        "    model_config_dict=MistralConfig(temperature=0.0).as_dict(),\n",
        ")\n",
        "\n",
        "# Use Mistral model\n",
        "model = mistral_model"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cPOelPF2UM0P"
      },
      "outputs": [],
      "source": [
        "from camel.agents import ChatAgent\n",
        "\n",
        "# Initialize a ChatAgent\n",
        "agent = ChatAgent(\n",
        "    system_message=\"You are a helpful document assistant.\",  # Define the agent's role\n",
        "    model=mistral_model\n",
        ")\n",
        "\n",
        "# Use the ChatAgent to generate insights based on the OCR output\n",
        "response = agent.step(\n",
        "    f\"Based on the following OCR-extracted content, write a concise summary of the document:\\n{ocr_response}\"\n",
        ")\n",
        "print(response.msgs[0].content)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "p3VfU-hdX4Ti"
      },
      "source": [
        "**For advanced usage of RAG capabilities with large files, please refer to our [RAG cookbook](https://docs.camel-ai.org/cookbooks/advanced_features/agents_with_rag#rag-cookbook).**"
      ]
    },
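    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Short of a full RAG setup, one simple option for documents that exceed the model's context window is to split the extracted text into chunks, summarize each chunk, and then summarize the summaries. The sketch below covers only the splitting step; `split_into_chunks` and the 4,000-character default are illustrative assumptions, not CAMEL APIs or recommended limits.\n",
        "\n",
        "```python\n",
        "def split_into_chunks(text, max_chars=4000):\n",
        "    # Greedily pack paragraphs (separated by blank lines) into chunks\n",
        "    # of at most max_chars characters. A single paragraph longer than\n",
        "    # max_chars is kept whole as its own chunk.\n",
        "    chunks, current = [], ''\n",
        "    for paragraph in text.split('\\n\\n'):\n",
        "        candidate = paragraph if not current else current + '\\n\\n' + paragraph\n",
        "        if len(candidate) <= max_chars:\n",
        "            current = candidate\n",
        "        else:\n",
        "            if current:\n",
        "                chunks.append(current)\n",
        "            current = paragraph\n",
        "    if current:\n",
        "        chunks.append(current)\n",
        "    return chunks\n",
        "```\n",
        "\n",
        "Each chunk could then be sent through `agent.step(...)` in turn, with the partial summaries combined in a final call.\n"
      ]
    },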
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sR4BU5Z_oanP"
      },
      "source": [
        "## 🧑🏻‍💻 Conclusion\n",
        "\n",
        "In this cookbook, we used **Mistral OCR** through CAMEL's `MistralReader` to convert a PDF or image into structured, machine-readable output, then passed that output to a Mistral-backed `ChatAgent` for summarization. Because Mistral OCR handles complex layouts, multiple languages, and mixed content such as tables and equations, the same two-step pipeline can prepare documents for RAG, research, or business automation workflows, including CAMEL-AI's multi-agent setups.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oA81T8-ToaWz"
      },
      "source": [
        "That's everything: Got questions about 🐫 CAMEL-AI? Join us on [Discord](https://discord.camel-ai.org)! Whether you want to share feedback, explore the latest in multi-agent systems, get support, or connect with others on exciting projects, we’d love to have you in the community! 🤝\n",
        "\n",
        "Check out some of our other work:\n",
        "\n",
        "1. 🐫 Creating Your First CAMEL Agent [free Colab](https://colab.research.google.com/drive/1cmWPxXEsyMbmjPhD2bWfHuhd_Uz6FaJQ?usp=sharing)\n",
        "\n",
        "2. Graph RAG Cookbook [free Colab](https://colab.research.google.com/drive/1uZKQSuu0qW6ukkuSv9TukLB9bVaS1H0U?usp=sharing)\n",
        "\n",
        "3. 🧑‍⚖️ Create A Hackathon Judge Committee with Workforce [free Colab](https://colab.research.google.com/drive/18ajYUMfwDx3WyrjHow3EvUMpKQDcrLtr?usp=sharing)\n",
        "\n",
        "4. 🔥 3 ways to ingest data from websites with Firecrawl & CAMEL [free Colab](https://colab.research.google.com/drive/1lOmM3VmgR1hLwDKdeLGFve_75RFW0R9I?usp=sharing)\n",
        "\n",
        "5. 🦥 Agentic SFT Data Generation with CAMEL and Mistral Models, Fine-Tuned with Unsloth [free Colab](https://colab.research.google.com/drive/1lYgArBw7ARVPSpdwgKLYnp_NEXiNDOd-?usp=sharing)\n",
        "\n",
        "Thanks from everyone at 🐫 CAMEL-AI\n",
        "\n",
        "\n",
        "<div class=\"align-center\">\n",
        "  <a href=\"https://www.camel-ai.org/\"><img src=\"https://i.postimg.cc/KzQ5rfBC/button.png\" width=\"150\"></a>\n",
        "  <a href=\"https://discord.camel-ai.org\"><img src=\"https://i.postimg.cc/L4wPdG9N/join-2.png\" width=\"150\"></a>\n",
        "  \n",
        "⭐ <i>Star us on [*GitHub*](https://github.com/camel-ai/camel), join our [*Discord*](https://discord.camel-ai.org) or follow our [*X*](https://x.com/camelaiorg)</i> ⭐\n",
        "</div>\n"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
